Explicit Knowledge-based Reasoning for Visual Question Answering
نویسندگان
چکیده
We describe a method for visual question answering which is capable of reasoning about contents of an image on the basis of information extracted from a large-scale knowledge base. The method not only answers natural language questions using concepts not contained in the image, but can provide an explanation of the reasoning by which it developed its answer. The method is capable of answering far more complex questions than the predominant long short-term memory-based approach, and outperforms it significantly in the testing. We also provide a dataset and a protocol by which to evaluate such methods, thus addressing one of the key issues in general visual question answering.
منابع مشابه
Explicit Reasoning over End-to-End Neural Architectures for Visual Question Answering
Many vision and language tasks require commonsense reasoning beyond data-driven image and natural language processing. Here we adopt Visual Question Answering (VQA) as an example task, where a system is expected to answer a question in natural language about an image. Current state-ofthe-art systems attempted to solve the task using deep neural architectures and achieved promising performance. ...
متن کاملExplicit Reasoning over End-to-End Neural Architectures
Many vision and language tasks require commonsense reasoning beyond data-driven image and natural language processing. Here we adopt Visual Question Answering (VQA) as an example task, where a system is expected to answer a question in natural language about an image. Current state-ofthe-art systems attempted to solve the task using deep neural architectures and achieved promising performance. ...
متن کاملMetareasoning as an Integral Part of Commonsense and Autocognitive Reasoning
In this paper we summarize our progress towards building a self-aware agent based on the definition of explicit selfawareness. An explicitly self-aware agent is characterized by 1) being based on an extensive and human-like knowledge base, 2) being transparent both in its behavior and in how the knowledge is represented and used, and 3) being able to communicate in natural language and directly...
متن کاملVisual Question Answering with Question Representation Update (QRU)
Our method aims at reasoning over natural language questions and visual images. Given a natural language question about an image, our model updates the question representation iteratively by selecting image regions relevant to the query and learns to give the correct answer. Our model contains several reasoning layers, exploiting complex visual relations in the visual question answering (VQA) t...
متن کاملFVQA: Fact-based Visual Question Answering
Visual Question Answering (VQA) has attracted much attention in both computer vision and natural language processing communities, not least because it offers insight into the relationships between two important sources of information. Current datasets, and the models built upon them, have focused on questions which are answerable by direct analysis of the question and image alone. The set of su...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2017